101 research outputs found

    Neural Nets with a Newton Conjugate Gradient Method on Multiple GPUs

    Training deep neural networks consumes an increasing share of computational resources in many compute centers. Often, a brute-force approach is used to obtain hyperparameter values. Our goal is (1) to improve on this by enabling second-order optimization methods with fewer hyperparameters for large-scale neural networks and (2) to survey the performance of optimizers on specific tasks so that users can be pointed to the best one for their problem. We introduce a novel second-order optimization method that requires only the action of the Hessian on a vector and avoids the huge cost of explicitly setting up the Hessian for large-scale networks. We compare the proposed second-order method with two state-of-the-art optimizers on five representative neural network problems, including regression and very deep networks from computer vision or variational autoencoders. For the largest setup, we efficiently parallelized the optimizers with Horovod and applied them to an 8-GPU NVIDIA P100 (DGX-1) machine. Comment: Accepted to PPAM conference
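
    The idea of using only the action of the Hessian on a vector can be illustrated with a minimal sketch. It is not the paper's method: the quadratic test problem, the finite-difference approximation, and the names grad, hvp, and newton_cg_step are all assumptions made for illustration. The sketch approximates H v from two gradient evaluations and feeds it to a plain conjugate gradient solve for the Newton direction, so the Hessian is never stored.

```python
# Sketch only: Hessian-vector products without forming the Hessian.
# The test problem and all function names are hypothetical.

def grad(x):
    """Gradient of f(x) = 0.5 x^T A x - b^T x for a fixed 2x2 SPD matrix A."""
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    return [sum(A[i][j] * x[j] for j in range(2)) - b[i] for i in range(2)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def hvp(grad_fn, x, v, eps=1e-5):
    """Approximate H(x) v by a central difference of two gradients."""
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    g_plus, g_minus = grad_fn(x_plus), grad_fn(x_minus)
    return [(gp - gm) / (2.0 * eps) for gp, gm in zip(g_plus, g_minus)]


def newton_cg_step(grad_fn, x, max_iters=10, tol=1e-10):
    """Solve H d = -g with conjugate gradients, touching H only through hvp."""
    g = grad_fn(x)
    d = [0.0] * len(x)
    r = [-gi for gi in g]          # residual of H d = -g at d = 0
    p = list(r)
    rs = dot(r, r)
    for _ in range(max_iters):
        if rs < tol:               # squared residual norm small enough
            break
        Hp = hvp(grad_fn, x, p)
        alpha = rs / dot(p, Hp)
        d = [di + alpha * pi for di, pi in zip(d, p)]
        r = [ri - alpha * hpi for ri, hpi in zip(r, Hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return d


x0 = [0.0, 0.0]
step = newton_cg_step(grad, x0)
x1 = [xi + di for xi, di in zip(x0, step)]
```

    For a quadratic objective a single Newton step reaches the minimizer, so x1 should solve A x = b. In an automatic-differentiation framework the finite difference would typically be replaced by an exact Hessian-vector product.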

    Free-Surface Lattice-Boltzmann Simulation on Many-Core Architectures

    Current advances in many-core technologies demand simulation algorithms suited to the corresponding architectures. At the same time, the resulting increase in computational power makes real-time and interactive simulations both possible and desirable. We present an OpenCL implementation of a Lattice-Boltzmann-based free-surface solver for GPU architectures. Massively parallel execution requires special techniques to keep the interface region consistent, which we address with a novel multipass method. We further compare the performance of different memory layouts for both a basic driven-cavity implementation and the free-surface method, point out the capabilities of our implementation in real-time and interactive scenarios, and briefly present visualizations of the flow obtained in real time.

    Multi-fidelity Constrained Optimization for Stochastic Black Box Simulators

    Constrained optimization of the parameters of a simulator plays a crucial role in the design process. These problems become challenging when the simulator is stochastic and computationally expensive and the parameter space is high-dimensional. Optimization can be performed efficiently only by using the gradient with respect to the parameters, but these gradients are unavailable in many legacy, black-box codes. We introduce the algorithm Scout-Nd (Stochastic Constrained Optimization for N dimensions) to tackle these issues by efficiently estimating the gradient, reducing the noise of the gradient estimator, and applying multi-fidelity schemes to further reduce computational effort. We validate our approach on standard benchmarks, demonstrating its effectiveness in optimizing parameters and its better performance compared to existing methods.

    Efficient Quantification of Model Uncertainties When De-boarding a Train

    It is difficult to provide live simulation systems for decision support: time is limited, and uncertainty quantification requires many simulation runs. We combine a surrogate model with the stochastic collocation method to overcome time and storage restrictions and show a proof of concept for a de-boarding scenario of a train.
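
    A minimal sketch of how stochastic collocation keeps the number of model runs small. It is not taken from the paper: it assumes a single uncertain parameter that follows a standard normal distribution, a 3-point Gauss-Hermite rule, and a made-up quadratic surrogate for the de-boarding time. The surrogate is evaluated only at the collocation nodes, and the mean and variance follow from weighted sums.

```python
import math

# 3-point Gauss-Hermite rule for a standard normal variable
# (probabilists' convention): nodes 0 and +/- sqrt(3), weights 2/3 and 1/6.
# The rule integrates polynomials up to degree 5 exactly.
NODES = [-math.sqrt(3.0), 0.0, math.sqrt(3.0)]
WEIGHTS = [1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0]


def surrogate(xi):
    """Hypothetical surrogate: de-boarding time in seconds as a function
    of one standardized uncertain parameter xi (an assumption, not data)."""
    return 60.0 + 5.0 * xi + 2.0 * xi ** 2


def collocation_stats(model):
    """Estimate mean and variance of model(xi) for xi ~ N(0, 1)
    from one model evaluation per collocation node."""
    values = [model(x) for x in NODES]
    mean = sum(w * v for w, v in zip(WEIGHTS, values))
    second_moment = sum(w * v * v for w, v in zip(WEIGHTS, values))
    return mean, second_moment - mean ** 2


mean_time, var_time = collocation_stats(surrogate)
```

    Three evaluations suffice here because the surrogate is a low-degree polynomial. A sampling-based estimate of the same statistics would typically need many more runs.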

    Octrees for Cooperative Work in a Network-Based Environment

    Ensuring global consistency in a cooperative working environment is the main focus of many current research projects in civil engineering and other fields. This paper discusses a new approach based on octrees. It shows that octrees not only improve the management and control of processes in a network-based working environment but also provide an efficient integration platform for processes from various disciplines, such as architecture and civil engineering. A client-server architecture built on octree-based collision detection and consistency assurance is described, along with information services that further support cooperative work.